Skiing Video Analysis¶
The purpose of this notebook is to:
- Refine the business problem statement
- Understand foundational computer vision technologies that might be usable for this problem
- Evaluate how well OpenPose can identify body keypoints and create skeletal tracking information from different types of helmet video cameras.
- Inform what next steps to take
This notebook is based on this Kaggle notebook (https://www.kaggle.com/code/rkuo2000/openpose-pytorch), whose GitHub repo is https://github.com/Hzzone/pytorch-openpose.
Improving Ski Technique: The Challenges of Effective Practicing¶
Mastering advanced skiing technique requires ingraining unfamiliar body positions through repetitive practice that builds muscle memory. However, several unique constraints limit effective practice time:
- Seasonal limitations (4-5 months per year)
- Limited active practice (~20% skiing vs. ~80% lift time)
- High financial costs compared to other sports
- Need for external video recording assistance
These constraints are significant: even Mikaela Shiffrin, widely considered the greatest skier of all time, gets only about 7 minutes of actual practice time during a 6-hour training session (12-15 runs).
Practice Enhancement Options
Skiers dedicated to improvement currently have three main options:
- Hiring a ski coach (expensive and time-limited)
- Using technology like CARV (which may prioritize metrics over proper technique)
- Self-coaching (requires strong body awareness and video analysis)
The Self-Coaching Challenge Using Video
Self-coaching through video analysis could significantly improve practice efficiency through an iterative cycle on every run:
- Identify technique focus areas
- Select appropriate drills
- Practice
- Review video
- Refine approach for the next run
However, current helmet camera technology produces unusable footage for self-analysis due to:
- Body part distortions
- Fish-eye effects
- Depth perception issues
- Occlusions
The Opportunity: Maximize Limited Practice Time on the Slope
Dedicated skiers would benefit significantly from tools that increase practice productivity, particularly through video-assisted self-coaching.
The development of better video analysis capabilities could transform every run into a valuable learning opportunity, maximizing limited practice time on the slopes.
Ideal State Video Perspective¶
The ideal video perspective for skier analysis is taken by a third party¶
The following image and video represent the ideal state video perspective that is required to analyze ski performance.
- Video is taken from downhill of the skier.
- The skier's joints are visible, and angles between joints can be clearly identified without distortion.
- Body parts and limb lengths appear proportionally sized.
- The skier is centered in the frame.
Obtaining this kind of video requires a second person to film the skier. Thus, a skier will only be able to get this kind of video infrequently.
from IPython.display import HTML
HTML("""
<div style="display: flex; justify-content: center; align-items: center;">
<div style="text-align: center; margin-top: 10px;">
<b>Skier Video</b>
<video width="440" height="390" controls>
<source src="https://mattconners.github.io/docs/skivision/ideal_ski_pov.mov" type="video/mp4">
Your browser does not support the video tag.
</video>
</div>
<div style="margin-left: 20px; text-align: center;">
<b>Analysis that can be done on this video</b>
<img src="https://mattconners.github.io/docs/skivision/Skier_Analysis_.jpg" width="350" height="350" alt="Description of the image">
</div>
</div>
""")
Current State Video Perspective¶
Video perspectives available with current helmet cameras
- There are two types of helmet camera mounts, and both produce video with significant perspective distortions
- These perspective distortions make the footage unusable for ski performance analysis
- Unicorn mount helmet camera footage is better than flush mount camera footage, but still inadequate for ski performance analysis
from IPython.display import HTML
HTML("""
<table style="border-collapse: collapse; width: 100%;">
<tr>
<th style="text-align: center;">2 Helmet Mount Types</th>
<th style="text-align: center;">Flush Mount Helmet Camera</th>
<th style="text-align: center;">Unicorn Mount Helmet Camera</th>
</tr>
<tr>
<td style="text-align: center;"><b>Helmet Mount Example</b></td>
<td style="text-align: center;"><img src="https://mattconners.github.io/docs/skivision/helemt_flushmount.jpeg" width="150" alt="Description of image 1"></td>
<td style="text-align: center;"><img src="https://mattconners.github.io/docs/skivision/UnicornMount.jpeg" width="150" alt="Description of image 2"></td>
</tr>
<tr>
<td style="text-align: center;"><b>Skier Image Example</b></td>
<td style="text-align: center;"><img src="https://mattconners.github.io/docs/skivision/helmet3.jpg" width="150" alt="Description of image 1"></td>
<td style="text-align: center;"><img src="https://mattconners.github.io/docs/skivision/Unicorn8.jpg" width="150" alt="Description of image 2"></td>
</tr>
<tr>
<td style="text-align: center;"><b>Skier Video Example</b></td>
<td style="text-align: center;">
<video width="300" height="200" controls>
<source src="https://mattconners.github.io/docs/skivision/helmet_flush_groomed.mov" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td style="text-align: center;">
<video width="300" height="200" controls>
<source src="https://mattconners.github.io/docs/skivision/unicorn_trees1_trimmed.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
<tr>
<td style="text-align: left;"><b><u>Perspective Distortion Challenges</u></b>
<ul>
<li><b>Body part distortion</b></li>
<li><b>Fish Eye effect distortion</b></li>
<li><b>Depth perception distortion</b></li>
<li><b>Occlusions of body parts</b></li>
</ul>
</td>
<td style="text-align: left;">
<p><b>Worse</b></p>
<ul>
<li><b>High:</b> head ~10x larger than feet</li>
<li><b>High:</b> poles and limbs curved</li>
<li><b>High:</b> leg length shorter than arm length</li>
<li><b>Medium:</b> esp. feet and shoulders</li>
</ul>
</td>
<td style="text-align: left;">
<p><b>Better</b></p>
<ul>
<li><b>Medium:</b> head ~5x larger than feet</li>
<li><b>High:</b> poles and limbs curved</li>
<li><b>High:</b> leg length shorter than arm length</li>
<li><b>Low:</b> only occasional, esp. feet</li>
</ul>
</td>
</tr>
</table>
<p>Note: I believe that the fish eye effect is caused by the Insta360 lens.</p>
""")
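The fish-eye and body-part distortions catalogued above can be illustrated with a simple one-coefficient radial distortion model. This is a sketch only; the actual Insta360 lens model is not known here, and `k1` is an assumed coefficient:

```python
import numpy as np

# A minimal sketch of radial (fish-eye-like) distortion, not the actual
# Insta360 lens model: a point at normalized radius r from the image center
# maps to r * (1 + k1*r^2), so geometry away from the center is stretched.
def radial_distort(points, k1=0.5):
    """Apply a one-coefficient radial distortion to Nx2 points
    given in normalized image coordinates centered at (0, 0)."""
    pts = np.asarray(points, dtype=float)
    r2 = np.sum(pts**2, axis=1, keepdims=True)
    return pts * (1.0 + k1 * r2)

# The same-length limb barely changes near the center but stretches near the edge,
# which is why limbs close to the lens look oversized relative to the feet.
center_limb = radial_distort([[0.0, 0.0], [0.1, 0.0]])
edge_limb = radial_distort([[0.8, 0.0], [0.9, 0.0]])
center_len = center_limb[1, 0] - center_limb[0, 0]
edge_len = edge_limb[1, 0] - edge_limb[0, 0]
print(center_len, edge_len)  # the edge limb appears roughly twice as long
```

Inverting a model like this (with calibrated coefficients) is the standard route to undistorting fish-eye footage before pose estimation.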
OpenPose Skeletal Tracking on Images and Videos from Different Camera Sources¶
# Import Libraries
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import os
import copy
import time
import shutil
from tabulate import tabulate
import json
import csv
from IPython.display import display, Video, FileLink, HTML
#from src import model
from src import util
from src.body import Body
#Body Pose model
body_estimation = Body('model/body_pose_model.pth')
/Users/mattconners/Documents/git/projects/cv_ski_video/src/body.py:19: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature. model_dict = util.transfer(self.model, torch.load(model_path))
#Load Images
# Load Single Ideal Image
test_image_ideal = 'images/Skiing_Videos_Images/ideal_POV.jpg'
# Load Single Unicorn Image
test_image_unicorn= 'images/Skiing_Videos_Images/Unicorn8.jpg'
#Load Single Helmet Cam Image
test_image_helmetcam = 'images/Skiing_Videos_Images/helmet3.jpg'
# Load Grid of Unicorn Images
image_paths_unicorn = [
'images/Skiing_Videos_Images/Unicorn1.jpg',
'images/Skiing_Videos_Images/Unicorn2.jpg',
'images/Skiing_Videos_Images/Unicorn3.jpg',
'images/Skiing_Videos_Images/Unicorn4.jpg',
'images/Skiing_Videos_Images/Unicorn5.jpg',
'images/Skiing_Videos_Images/Unicorn7.jpg',
'images/Skiing_Videos_Images/Unicorn8.jpg',
'images/Skiing_Videos_Images/Unicorn9.jpg',
'images/Skiing_Videos_Images/Unicorn10.jpg',
'images/Skiing_Videos_Images/Unicorn11.jpg',
'images/Skiing_Videos_Images/Unicorn12.jpg',
'images/Skiing_Videos_Images/Unicorn15.jpg'
]
image_paths_helmetcam = [
'images/Skiing_Videos_Images/helmet1.jpg',
'images/Skiing_Videos_Images/helmet2.jpg',
'images/Skiing_Videos_Images/helmet3.jpg',
'images/Skiing_Videos_Images/helmet4.jpg',
'images/Skiing_Videos_Images/helmet5.jpg',
'images/Skiing_Videos_Images/helmet6.jpg',
'images/Skiing_Videos_Images/helmet7.jpg',
'images/Skiing_Videos_Images/helmet8.jpg',
]
import matplotlib.pyplot as plt
# Paths to your images
image_paths = [
test_image_ideal,
test_image_unicorn,
test_image_helmetcam
]
# Titles for each image
titles = ['Video taken by third party', 'Unicorn Stick Helmet Mount', 'Flush Mount Helmet Cam']
# Load and display images
fig, axes = plt.subplots(1, 3, figsize=(18, 6)) # 1 row, 3 columns
for i, (img_path, title) in enumerate(zip(image_paths, titles)):
    image = plt.imread(img_path)
    axes[i].imshow(image)
    axes[i].set_title(title)
    axes[i].axis('off')  # Hide the axis
plt.tight_layout()
plt.show()
# Input and output paths
test_image_ideal = 'images/Skiing_Videos_Images/ideal_POV.jpg' # Path to the input image
output_image_path = 'images/skeletal_tracking_ideal_image.jpg'
output_csv_path = 'images/keypoints_ideal_image.csv'
# Read the input image
oriImg = cv2.imread(test_image_ideal)
# Check if image was successfully loaded
if oriImg is None:
    raise FileNotFoundError(f"Image not found or unable to load: {test_image_ideal}")
candidate, subset = body_estimation(oriImg)
canvas = copy.deepcopy(oriImg)
canvas = util.draw_bodypose(canvas, candidate, subset)
print("Number of Keypoints Detected: ", len(candidate)) # number of keypoints
print("Number of Persons Detected: ", len(subset)) # number of persons
# Body part names for the first 14 OpenPose (COCO) keypoint indices;
# the face keypoints (eyes, ears) are intentionally excluded
body_parts = [
"Nose", "Neck", "RShoulder", "RElbow", "RWrist",
"LShoulder", "LElbow", "LWrist", "RHip", "RKnee",
"RAnkle", "LHip", "LKnee", "LAnkle"
]
# Extract keypoints into a list of per-person dicts
keypoints = []
for person in subset:
    person_keypoints = {"Person": f"Person_{len(keypoints)+1}"}
    # person[:18] holds candidate indices for each keypoint (-1 if missing);
    # the last two entries are the overall score and part count, so skip them
    for i in range(len(body_parts)):
        idx = int(person[i])
        if idx != -1:
            x, y = candidate[idx][:2]
            person_keypoints[f'{body_parts[i]}_x'] = x
            person_keypoints[f'{body_parts[i]}_y'] = y
    keypoints.append(person_keypoints)
# Create DataFrame
df_keypoints_ideal_image = pd.DataFrame(keypoints)
# Save the DataFrame to a CSV file
df_keypoints_ideal_image.to_csv(output_csv_path, index=False)
# Save the processed image with keypoints
cv2.imwrite(output_image_path, canvas)
# Print the DataFrame using pandas for a neat table
print(df_keypoints_ideal_image)
# Plot the image with keypoints
plt.figure(figsize=(6, 6))
plt.imshow(canvas[:, :, [2, 1, 0]])
plt.axis('off')
plt.show()
Number of Keypoints Detected: 18
Number of Persons Detected: 1
Person Nose_x Nose_y Neck_x Neck_y RShoulder_x RShoulder_y \
0 Person_1 532.0 279.0 557.0 301.0 474.0 289.0
RElbow_x RElbow_y RWrist_x ... RKnee_x RKnee_y RAnkle_x RAnkle_y \
0 392.0 347.0 361.0 ... 371.0 597.0 204.0 726.0
LHip_x LHip_y LKnee_x LKnee_y LAnkle_x LAnkle_y
0 576.0 498.0 498.0 587.0 371.0 700.0
[1 rows x 29 columns]
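These keypoint coordinates are exactly what the joint-angle analysis described earlier needs. As a sketch, the angle at any joint can be computed from three keypoints; the left-leg coordinates below are taken from the Person_1 row printed above:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by segments b->a and b->c."""
    a, b, c = np.asarray(a, float), np.asarray(b, float), np.asarray(c, float)
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Left-leg keypoints from the Person_1 row above (pixel coordinates)
l_hip, l_knee, l_ankle = (576, 498), (498, 587), (371, 700)
print(f"Left knee angle: {joint_angle(l_hip, l_knee, l_ankle):.1f} deg")
```

Note that image y-coordinates grow downward, but that does not affect the angle magnitude. Comparing such angles across frames (or against a coach's reference run) is one concrete way this keypoint data could drive self-coaching.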
# Run pose estimation on the grid of unicorn-mount images
# Number of images
num_images = len(image_paths_unicorn)
# Number of rows and columns in the grid
num_rows = 3
num_cols = 4
# Create the figure
fig, axes = plt.subplots(num_rows, num_cols, figsize=(20, 20))
# Iterate through the images and perform body pose estimation
for i, image_path in enumerate(image_paths_unicorn):
    # Read the image
    oriImg = cv2.imread(image_path)
    # Perform body pose estimation
    candidate, subset = body_estimation(oriImg)
    # Draw the body pose on a copy of the image
    canvas = copy.deepcopy(oriImg)
    canvas = util.draw_bodypose(canvas, candidate, subset)
    # Determine the position of the subplot
    row = i // num_cols
    col = i % num_cols
    # Plot the image with the body pose
    axes[row, col].imshow(cv2.cvtColor(canvas, cv2.COLOR_BGR2RGB))
    axes[row, col].axis('off')
# Adjust the layout
plt.tight_layout()
plt.show()
# Run pose estimation on the grid of flush-mount images
# Number of images
num_images = len(image_paths_helmetcam)
# Number of rows and columns in the grid
num_rows = 2
num_cols = 4
# Create the figure
fig, axes = plt.subplots(num_rows, num_cols, figsize=(20, 20))
# Iterate through the images and perform body pose estimation
for i, image_path in enumerate(image_paths_helmetcam):
    # Read the image
    oriImg = cv2.imread(image_path)
    # Perform body pose estimation
    candidate, subset = body_estimation(oriImg)
    # Draw the body pose on a copy of the image
    canvas = copy.deepcopy(oriImg)
    canvas = util.draw_bodypose(canvas, candidate, subset)
    # Determine the position of the subplot
    row = i // num_cols
    col = i % num_cols
    # Plot the image with the body pose
    axes[row, col].imshow(cv2.cvtColor(canvas, cv2.COLOR_BGR2RGB))
    axes[row, col].axis('off')
# Adjust the layout
plt.tight_layout()
plt.show()
Detect Skeletal Keypoints on Videos¶
Video of Ideal Camera POV with Skeletal Tracking¶
# Input and output paths
video_path = 'images/Skiing_Videos_Images/ideal_ski_pov1.mov'
output_video_path = 'output/ski_pose_ideal_pov.mp4'
output_json_path = 'output/keypoints_data_ideal.json'
output_csv_path = 'output/keypoints_data_ideal.csv' # Path for the CSV file
cap = cv2.VideoCapture(video_path)
fps = int(cap.get(cv2.CAP_PROP_FPS))
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))
# Start the timer
start_time = time.time()
# List to store keypoint data
keypoints_data = []
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    candidate, subset = body_estimation(frame)
    canvas = copy.deepcopy(frame)
    canvas = util.draw_bodypose(canvas, candidate, subset)
    out.write(canvas)
    # Store keypoint data for each frame
    keypoint_data = {
        'frame': int(cap.get(cv2.CAP_PROP_POS_FRAMES)),
    }
    for i in range(len(candidate)):
        keypoint_data[f'x_{i}'] = candidate[i][0]  # keypoint x-coordinate
        keypoint_data[f'y_{i}'] = candidate[i][1]  # keypoint y-coordinate
    keypoints_data.append(keypoint_data)
cap.release()
out.release()
# Collect all possible field names
fieldnames = set()
for data in keypoints_data:
    fieldnames.update(data.keys())
fieldnames = sorted(fieldnames)  # Sort field names for consistency
# Save keypoints data to a JSON file
with open(output_json_path, 'w') as f:
    json.dump(keypoints_data, f, indent=4)
# Save keypoints data to a CSV file
with open(output_csv_path, 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(keypoints_data)
# End the timer
end_time = time.time()
# Display runtime
print(f"Runtime: {end_time - start_time:.2f} seconds")
# Display the resulting video
from IPython.display import Video, display, FileLink
display(Video(output_video_path, embed=True))
# Provide a download link for the video
display(FileLink(output_video_path))
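With the per-frame records in `keypoints_data` saved to JSON/CSV, a quick tracking-quality check is counting how many keypoints were detected in each frame; frames with few detections flag occlusion or distortion problems. A minimal sketch (`detection_counts` is a hypothetical helper, and the sample records below are made up so the snippet is self-contained):

```python
# Count detected (x, y) keypoint pairs per frame from records shaped like
# those saved above: {'frame': n, 'x_0': ..., 'y_0': ..., 'x_1': ...}.
# In the notebook you would pass the real keypoints_data list instead.
def detection_counts(frames):
    """Return {frame_number: number_of_detected_keypoints}."""
    counts = {}
    for rec in frames:
        n_coords = sum(1 for k in rec if k.startswith(('x_', 'y_')))
        counts[rec['frame']] = n_coords // 2  # one keypoint = one (x, y) pair
    return counts

sample = [
    {'frame': 1, 'x_0': 532.0, 'y_0': 279.0, 'x_1': 557.0, 'y_1': 301.0},
    {'frame': 2, 'x_0': 530.0, 'y_0': 280.0},
]
print(detection_counts(sample))  # {1: 2, 2: 1}
```

Comparing these counts between the ideal-POV and helmet-cam runs would quantify how much the perspective distortions degrade OpenPose tracking.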
Video of Unicorn Camera POV with Skeletal Tracking¶
# Input and output paths
video_path = 'images/Skiing_Videos_Images/unicorn_trees1.mov'
output_video_path = 'output/output_ski_pose_unicorn_pov.mp4'
output_json_path = 'output/keypoints_data_unicorn.json'
output_csv_path = 'output/keypoints_data_unicorn.csv'
cap = cv2.VideoCapture(video_path)
fps = int(cap.get(cv2.CAP_PROP_FPS))
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_video_path, fourcc, fps, (width, height))
# Start the timer
start_time = time.time()
# List to store keypoint data
keypoints_data = []
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    candidate, subset = body_estimation(frame)
    canvas = copy.deepcopy(frame)
    canvas = util.draw_bodypose(canvas, candidate, subset)
    out.write(canvas)
    # Store keypoint data for each frame
    keypoint_data = {
        'frame': int(cap.get(cv2.CAP_PROP_POS_FRAMES)),
    }
    for i in range(len(candidate)):
        keypoint_data[f'x_{i}'] = candidate[i][0]  # keypoint x-coordinate
        keypoint_data[f'y_{i}'] = candidate[i][1]  # keypoint y-coordinate
    keypoints_data.append(keypoint_data)
cap.release()
out.release()
# Collect all possible field names
fieldnames = set()
for data in keypoints_data:
    fieldnames.update(data.keys())
fieldnames = sorted(fieldnames)  # Sort field names for consistency
# Save keypoints data to a JSON file
with open(output_json_path, 'w') as f:
    json.dump(keypoints_data, f, indent=4)
# Save keypoints data to a CSV file
with open(output_csv_path, 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(keypoints_data)
# End the timer
end_time = time.time()
# Display runtime
print(f"Runtime: {end_time - start_time:.2f} seconds")
# Display the resulting video
from IPython.display import Video, display, FileLink
display(Video(output_video_path, embed=True))
# Provide a download link for the video
display(FileLink(output_video_path))
Runtime: 668.16 seconds
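The ~11-minute runtime comes from running `body_estimation` on every frame. One common mitigation, not used in this notebook, is to run pose estimation only on every Nth frame; joint-angle analysis rarely needs the camera's full frame rate. A minimal sketch of the stride logic (`frames_to_process` and `target_fps` are assumed names, not part of the pipeline above):

```python
# Sketch: choose which frame indices to run pose estimation on,
# downsampling from the video's native fps to roughly target_fps.
def frames_to_process(total_frames, fps, target_fps=10):
    """Return the frame indices to process at roughly target_fps."""
    stride = max(1, round(fps / target_fps))
    return list(range(0, total_frames, stride))

# e.g. a 600-frame clip at 60 fps -> every 6th frame, a ~6x speedup
print(len(frames_to_process(600, 60)))  # 100
```

Inside the `while` loop, frames whose index is not in this set would be skipped (or written out unannotated), cutting pose-estimation calls roughly in proportion to the stride.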